Abstract

The General Single-Dish Data format (GSDD) was developed in the mid-1980s as a data model to support centimeter,
millimeter and submillimeter instrumentation at NRAO, JCMT, the University of Arizona and IRAM. We provide an overview of the
GSDD requirements and associated data model, discuss the implementation of the resultant file formats, describe its usage in the
observatories and provide a retrospective on the format.
1. Introduction

In the late 1970s and early 1980s millimeter and submillimeter
single-dish astronomy was undergoing a significant period of
growth (see e.g., Robson, 2013) with the National Radio Astronomy
Observatory (NRAO) 12-m telescope leading the way (see e.g.,
Gordon, 2005) and with multiple observatories being developed
such as the Institut de Radioastronomie Millimetrique (IRAM)
30-m (Baars, 1981), the 15-m James Clerk Maxwell Telescope
(JCMT; Hills, 1985), the 10-m Sub-Millimeter Telescope (SMT;
Wilson, 1985), the 15-m Swedish European Southern Observatory
Submm Telescope (SEST; Delannoy, 1985), and the Caltech
Submillimeter Observatory (CSO; Phillips, 1988). In this
environment it was recognized by some institutions that the
ability for raw or partially processed data taken on one
telescope to be reduced and analyzed by the software written at
another telescope would be extremely useful and could lead to
significant savings on software development effort.

At this time the Flexible Image Transport System (FITS;
Wells et al., 1981) was considered mainly suitable as a means of
exchanging image data using tapes (Greisen et al., 1980). The
FITS standard, which then lacked the capability to use binary
tables and could only store a single ASCII table per file, was
not deemed an efficient format to store complex mm/submm
time-series and spectral-line data from single-dish telescopes
that usually required many sets of tabular data.

The General Single Dish Data format (GSDD) was developed in the
1980s to solve the data processing and acquisition requirements
of the NRAO, IRAM, University of Arizona and JCMT observatories.
Initial discussions between NRAO 12m and IRAM staff began in
1983, and subsequently included JCMT representatives. At around
this same time, however, IRAM started development of the
Continuum and Line Analysis Single-dish Software class^ (Pety,
2005, ascl:1305.010) data reduction package, and they did not
follow up on the GSDD initiative. The GSDD format, agreed in 1986
(see e.g., Fairclough et al., 1987; Stobie, 1987), consisted of a
data model for specifying centimeter, millimeter and
submillimeter observations (continuum and spectral-line
instrumentation) and a specification of how the bytes would be
represented on disk. The format was described in both JCMT
technical notes (Fairclough et al., 1987; Jenness et al., 1999;
Scobbie, 1994) and an NRAO Newsletter article (Stobie, 1987), but
a formal definition of the format was not published in the
literature. In this article we present the first joint NRAO/JCMT
description of the model and provide a retrospective on the
history and usage of the format. A basic introduction to
millimeter and submillimeter observing techniques is beyond the
scope of this paper but good background information is provided
by Stanimirovic et al. (2002).

^http://www.iram.fr/IRAMFR/GILDAS

Table 1: Base classes defined for GSDD. The final two classes listed, 14 and
55, were only defined at JCMT.

Class number   Class name
1              Basic Information
2              Pointing Parameters
3              Observing Parameters
4              Positions
5              Environment
6              Map Parameters
7              Data Parameters
8              Engineering Parameters
9              Telescope Dependent Parameters
10             Open Data Reduction Parameters
11             Phase Block
12             Receiver configurations
13             Data Values
14             Pointing History (JCMT)
55             Inclinometry (JCMT)

2. Data Model 

To allow interoperability of data files between differing
observatories it was important to develop a shared data model.
The initial approach was to define the simplest possible model
to allow sharing of raw or partially reduced spectra between
multiple data reduction software packages. Since JCMT was
still in the development phase during these discussions the
focus became how to represent the raw instrument data on disk.
This was simplified somewhat by the JCMT system not storing
individual on-source and off-source or calibration data, but
storing calibrated spectra from the heterodyne systems and
chop-subtracted time-series for the continuum instruments.

The model was designed to handle general sub-mm observing
techniques with different switching schemes, such as position
switching, beam switching and frequency switching, and
included on-the-fly mapping (where the telescope is moved
during acquisition) as well as stare and gridded observations.

When designing the model, related items were grouped into
numbered classes and the parameter name was prefixed by that
class number. The class groupings are shown in Table 1. In
early JCMT documents (e.g. Fairclough et al., 1987; Fairclough
and Padman, 1985) there is disagreement in the class numbering,
for example using S2EPH^ or C3EPH for the epoch of the
coordinates rather than C4EPH, reflecting the uncertainty in the
standardized model, but eventually (see e.g. Scobbie, 1994) the
NRAO convention was adopted and the core model solidified
(Stobie, 1985, defined the NRAO naming scheme). Seventy-one
data items were defined in the shared NRAO/JCMT GSDD
data model^ and those are detailed in Table 2. For example,
C3DAT referred to the UT date of the observation, C1SNO the
scan number, and C7VR the source radial velocity.

^In early iterations S was used to indicate a scalar item and V a vector item.
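
The naming convention can be illustrated with a short Python sketch that splits an item name into its class number and mnemonic. The function name `split_item_name` and the regular expression are assumptions made for this illustration; they are not taken from any NRAO or JCMT software. The class names are those of Table 1.

```python
import re

# Class names as listed in Table 1.
GSDD_CLASSES = {
    1: "Basic Information",
    2: "Pointing Parameters",
    3: "Observing Parameters",
    4: "Positions",
    5: "Environment",
    6: "Map Parameters",
    7: "Data Parameters",
    8: "Engineering Parameters",
    9: "Telescope Dependent Parameters",
    10: "Open Data Reduction Parameters",
    11: "Phase Block",
    12: "Receiver configurations",
    13: "Data Values",
    14: "Pointing History (JCMT)",
    55: "Inclinometry (JCMT)",
}

# Assumed pattern: the letter C, the class number, then a mnemonic that
# starts with a letter (so C12RF is class 12, mnemonic RF).
_ITEM_RE = re.compile(r"^C(\d+)([A-Z][A-Z0-9_]*)$")


def split_item_name(name):
    """Return (class_number, mnemonic) for a GSDD item name such as C3DAT."""
    match = _ITEM_RE.match(name.strip().upper())
    if match is None:
        raise ValueError("not a class-prefixed GSDD item name: %r" % name)
    return int(match.group(1)), match.group(2)


for item in ("C3DAT", "C1SNO", "C7VR", "C12RF"):
    number, mnemonic = split_item_name(item)
    print(item, number, GSDD_CLASSES[number], mnemonic)
```

Names without a class prefix, such as the JCMT items UAZ and UEL, are rejected by this sketch rather than guessed at.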

At the JCMT these GSDD names (known locally as the
“NRAO” names) were written to disk files but were mapped
to local equivalents in the acquisition computers. For example,
C12RF, the rest frequency, mapped to FE_NUREST in the
acquisition shared memory system and was equivalent to the
RESTFREQ FITS keyword. A full list of the equivalences for
JCMT can be found elsewhere (Jenness et al., 1999; Scobbie,
1994). As commissioning took place, new instrumentation
arrived and new facilities were added, the JCMT data model
diverged, with many new items being added without consultation
with NRAO. These items are listed in Tables 3, 4 and 5. Class
55 (Inclinometry) is not included here as the inclinometry data
were not archived and therefore data files describing these
observations are extremely rare.
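
As a minimal sketch of such an equivalence table, the Python fragment below maps GSDD names to the JCMT shared-memory name and the FITS keyword. Only the two equivalences stated in this paper are filled in (C12RF, and C1SNA1 for the OBJECT keyword); the dictionary layout and the helper `to_fits_header` are assumptions for illustration, and values are passed through without any unit conversion.

```python
# GSDD name -> (JCMT shared-memory name, FITS keyword).
# None marks an equivalent that this paper does not state.
EQUIVALENTS = {
    "C12RF": ("FE_NUREST", "RESTFREQ"),
    "C1SNA1": (None, "OBJECT"),
}


def to_fits_header(gsdd_items):
    """Return {FITS keyword: value} for items with a known FITS equivalent."""
    header = {}
    for name, value in gsdd_items.items():
        _, fits_key = EQUIVALENTS.get(name, (None, None))
        if fits_key is not None:
            header[fits_key] = value
    return header


print(to_fits_header({"C12RF": 230538.0, "C1SNA1": "ORION", "C3DAT": 1987.0312}))
```

The complete list of equivalences is given in the JCMT documents cited above, not here.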

One feature of the GSDD design was that some classes were
explicitly reserved for local use. Class 9 was used for
telescope dependent parameters and the defined set differed
between Green Bank and the 12m, with JCMT adopting a single
item, C9OT, from the 12m.

The NRAO implementation, not including class 9, includes
26 items not found in the JCMT version and these are given
in Table 6. The following list, which is not intended to be
exhaustive, details the main discrepancies and major
compatibility problems between the NRAO and JCMT versions of
the GSDD data model. The first part describes in detail the
items found in the NRAO model but not implemented at JCMT:

C1DLN C1HLN are not needed at JCMT because the length of
the header region and the length of the data region are
encoded in the file format design.

C1SNA is the source object name and exists as two separate
items at JCMT, C1SNA1 and C1SNA2, to allow the object
name to be specified in two parts or with an alternative
name given. C1SNA1 is the primary source name and is
equivalent to the OBJECT FITS keyword. Historically the
alternate or secondary part of the name was rarely used at
JCMT so the name change, in hindsight, turned out to be
unnecessary.

C2PC was used at NRAO to specify a four-element secondary
pointing correction. The JCMT version specifies this as
four discrete scalar items, C2PC1 to C2PC4, rather than
using an array.


followed by the class number. 

^72 if the telescope-specific C9OT (Observing Tolerance) item is included,
which was present in the NRAO 12m definition and at JCMT but not used for
Green Bank. In some very early files JCMT erroneously used C90T for this
item due to a transcription error confusing the letter “O” with the number zero.
This sometimes implies that JCMT used class 90.
Table 2: Core components of GSDD data model present in both NRAO and JCMT implementations (Stobie, 1985). Relevant units are given in square brackets
using the NRAO convention. JCMT data files contain unit information explicitly.

Item     Description
C1BKE    Backend
C1DP     Precision of the data in bits and data type
C1OBS    Observer Initials
C1ONA    Observer name
C1PID    Project ID
C1RCV    Frontend
C1SNO    Scan/Observation number
C1STC    Type of observation
C1TEL    Telescope name
C2FL     EW focus
C2FR     Radial focus
C2FV     NS focus
C2ORI    Secondary orientation
C2XPC    Az/RA pointing correction [arcsec]
C2YPC    El/Dec pointing correction [arcsec]
C3CL     Length of cycle [sec]
C3DAT    UT date (YYYY.MMDD format)
C3LST    LST at start [hours]
C3NRC    Number of rx/backend channels
C3NSV    Number of switching/phase table variables
C3PPC    Number of phases per cycle
C3SRT    Integration time [sec]
C3UT     UT of observation [hours]
C4AZ     Azimuth at C3UT [deg]
C4CSC    Code for coordinate system
C4EDC    Epoch declination [deg]
C4EL     Elevation at C3UT [deg]
C4EPH    Epoch of coordinates [years]
C4ERA    Epoch Right Ascension [deg]
C4GB     Galactic Latitude [deg]
C4GL     Galactic Longitude [deg]
C4RX     Reference X position [deg]
C4RY     Reference Y position [deg]
C4SX     Source X [deg]
C4SY     Source Y [deg]
C5AT     Ambient Temperature [°C]
C5DP     Dew point [°C]
C5IR     Refractive index
C5MM     Atmospheric vapor pressure [mm]
C5PRS    Atmospheric pressure [mm Hg]
C5RH     Relative humidity [%]
C6DX     Delta X offset [arcsec]
C6DY     Delta Y offset [arcsec]
C6FC     Reference frame coord code
C6MSA    Scanning angle [deg]
C6NP     Number of grid points
C6XGC    Starting X grid position
C6XNP    Number of X grid points
C6YGC    Starting Y grid position
C6YNP    Number of Y grid points
C7BCV    Bad channel value
C7CAL    Calibration type
C7OSN    Calibration scan/observation number
C7VC     Velocity correction [km/s]
C7VR     Radial velocity of source [km/s]
C7VRD    Velocity definition code
C8AAE    Aperture efficiency
C8ABE    Beam efficiency
C8EF     Forward spillover & scattering efficiency
C8EL     Rear spillover & scattering efficiency
C8GN     Antenna gain
C11VD    Phase table names
C12BW    Bandwidth
C12CF    Observed frequency [MHz]
C12CT    Calibration temperature [K]
C12FR    Frequency resolution [MHz]
C12RF    Rest frequency [MHz]
C12RST   Reference system temperature [K]
C12RT    Receiver temperature [K]
C12SST   Source system temperature [K]
C12WO    Water opacity
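
The packed date in C3DAT and the decimal hours in C3UT can be combined into a single timestamp. The Python sketch below is one way to do so; the function name `decode_ut` is an assumption for illustration, and it presumes that C3DAT is held with enough precision (for example as a double-precision value) to resolve the day digits.

```python
from datetime import datetime, timedelta


def decode_ut(c3dat, c3ut_hours):
    """Combine C3DAT (YYYY.MMDD) and C3UT (decimal hours) into a datetime."""
    # Scale to an integer YYYYMMDD first so that floating-point noise in the
    # fractional part of the stored value is rounded away.
    yyyymmdd = int(round(c3dat * 10000))
    year, month_day = divmod(yyyymmdd, 10000)
    month, day = divmod(month_day, 100)
    return datetime(year, month, day) + timedelta(hours=c3ut_hours)


print(decode_ut(1987.0312, 6.5).isoformat())
```

The result is a naive datetime: the sketch does not distinguish between UTC and UT1.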


C2UXP C2UYP are the user Az/RA and El/Dec pointing corrections
in arcsec but at JCMT these were simply called UAZ
and UEL with no class prefix and no RA/Dec equivalent.

C4DO is a three-element array labeled “Descriptive Origin”
describing the position and angle of the coordinate system
defined by the observer. At JCMT this was implemented
as three distinct items C4DO1 through C4DO3 and specified
the observing cell size and position angle with respect
to local vertical. There was disagreement between NRAO
and JCMT on the definition here as the three elements at
NRAO referred to the horizontal and vertical position and
the position angle with respect to the horizontal axis.
Documents and source code from JCMT indicate these items
were not used and are duplicates of items C6DX, C6DY and
C6MSA.

C4IX C4IY are the coordinates of the telescope as measured by
the encoders. This information was not recorded by JCMT.

C6XZ C6YZ specify the position of the map origin. These
coordinates are not stored at JCMT as the map is defined in
terms of offsets from the specified tracking centre.

C7FW is the beam full width at half maximum in arcsec at
NRAO but at JCMT the item used is C7HP and most JCMT
data files do not seem to set it.
Table 3: JCMT-specific keywords from class 1 to class 4. This includes the three items that were not allocated a class.

Item              Description
CELL_V2Y          Position angle of cell y axis (CCW)
UAZ               User az correction
UEL               User el correction
C1BTYP            Type of backend
C1FTYP            Type of frontend
C1HGT             Height of telescope above sea level
C1IFS             Name of the IF device
C1LAT             Geodetic latitude of telescope (North +ve)
C1LONG            Geographical longitude of telescope (West +ve)
C1ONA1            Name of the support scientist
C1ONA2            Name of the telescope operator
C1SNA1            Source name part 1
C1SNA2            Source name part 2 or altern. name
C2PC1             Angle by which lower axis is north of ideal
C2PC2             Angle by which lower axis is east of ideal
C2PC3             Angle by which upper axis is not perpendicular to lower
C2PC4             Angle by which beam is not perpendicular to upper axis
C3BEFENULO        Copy of frontend LO frequency per backend section
C3BEFESB          Copy of frontend sideband sign per backend section
C3BEINCON         IF output channels connected to BE input channels
C3BESCONN         BE input channels connected to this section
C3BESSPEC         Subsystem nr to which each backend section belongs
C3BETOTIF         Total IF per backend section
C3CAL             Calibration observation?
C3CEN             Centre moves between scans?
C3CONFIGNR        Backend configuration
C3DASCALSRC       DAS calibration source for backend calibration (POWER or DATA)
C3DASOUTPUT       Description of output in DAS DATA (SPECTRUM, T_REC, T_SYS, etc.)
C3DASSHFTFRAC     DAS calibration source for backend calibration (POWER or DATA)
C3FLY             Data taken on the fly or in discrete mode?
C3FOCUS           Focus observation?
C3INTT            Scan integration time
C3LSPC            Number of channels per backend section
C3MAP             Map observation?
C3MXP             Maximum number of map points done in a phase
C3NCH             Number of backend output channels
C3NCI             Maximum number of cycles in the scan
C3NCP             Total number of xy positions observed during a cycle
C3NCYCLE          Number of cycles done in the scan
C3NFOC            Number of frontend output channels
C3NIS             Number of scans
C3NLOOPS          Number of scans per observation commanded at observation start
C3NMAP            Number of map points
C3NOIFPBES        Number of IF inputs to each section (2 for correlator, 1 for AOS)
C3NO_SCAN_VARS1   Number of scan table 1 variables
C3NO_SCAN_VARS2   Number of scan table 2 variables
C3NPP             Number of dimension in the map table
C3NRS             Number of backend sections
C3NSAMPLE         Number of scans done
C3OVERLAP         Subband overlap
C3UT1C            UT1-UTC correction interpolated from time service telex (in days)
C4AMPL_EW         Secondary mirror chopping amplitude parallel to lower axis
C4AMPL_NS         Secondary mirror chopping amplitude parallel to upper axis
C4AXY             Angle between cell y axis and x-axis (CCW)
C4AZERR           DAZ: Net Az offset at start (inc. tracker ball setting and user correction)
C4CECO            Centre coords. AZ=1; EQ=3; RD=4; RB=6; RJ=7; GA=8
Table 4: JCMT-specific keywords from class 4 to class 7.

Item           Description
C4DECDATE      Declination of date
C4DEL          Telescope upper axis correction for secondary mirror XYZ
C4DO1          Cell X dimension; descriptive origin item 1
C4DO2          Cell y dimension; descriptive origin item 2
C4DO3          Angle by which the cell x axis is oriented with respect to local vertical
C4EDEC         Declination of source for EPOCH
C4EDEC2000     Declination J2000
C4ELERR        DEL: Net El offset at start (inc. tracker ball setting and user correction)
C4EPT          Type of epoch, JULIAN, BESSELIAN or APPARENT
C4EW_ENCODER   Secondary mirror ew encoder value
C4EW_SCALE     Secondary mirror ew chop scale
C4FRQ          Secondary mirror chopping period
C4FUN          Secondary mirror chopping waveform
C4LSC          Char. code for local x-y coord. system
C4MCF          Centre moving flag (solar system object)
C4MOCO         Mounting of telescope; defined as LOWER/UPPER axes, e.g. AZ/ALT
C4NS_ENCODER   Secondary mirror ns encoder value
C4NS_SCALE     Secondary mirror ns chop scale
C4ODCO         Units of cell and mapping coordinates; offset definition code
C4OFFS_EW      Secondary mirror offset parallel to lower axis (East-West Tilt)
C4OFFS_NS      Secondary mirror offset parallel to upper axis (North-South Tilt)
C4PER          Secondary mirror chopping period
C4POSANG       Secondary mirror chop position angle
C4RA2000       Right ascension J2000
C4RADATE       Right Ascension of date
C4SM           Secondary mirror is chopping
C4SMCO         Secondary mirror chopping coordinate system
C4THROW        Secondary mirror chop throw
C4X            Secondary mirror absolute X position at observation start
C4Y            Secondary mirror absolute Y position at observation start
C4Z            Secondary mirror absolute Z position at observation start
C5IR1          Refraction constant A
C5IR2          Refraction constant B
C5IR3          Refraction constant C
C6CYCLREV      Cycle reversal flag
C6MODE         Observation mode
C6REV          Map rows scanned in alternate directions?
C6SD           Map rows are in X (horizontal) or Y (vertical) direction
C6ST           Type of observation
C6XPOS         In first row x increases (TRUE) or decreases (FALSE)
C6YPOS         In first row y increases (TRUE) or decreases (FALSE)
C7AP           Aperture
C7FIL          Filter
C7HP           FWHM of the beam profile (mean)
C7NIF          Number of IF channels
C7PHASE        Lockin phase
C7SEEING       Seeing at JCMT
C7SEETIME      SAO seeing time (YYMMDDHHMM)
C7SNSTVTY      Lockin sensitivity in scale range units
C7SNTVTYRG     Sensitivity range of lockin
C7SZVRAD       Number of elements of vradial array
C7TAU225       CSO tau at 225GHz
C7TAURMS       CSO tau rms
C7TAUTIME      CSO tau time (YYMMDDHHMM)


C11* Most sub-mm telescopes divide an “observation cycle”
into a series of “phases”, where the separate phases
represent different states (for example, on-source, off-source,
cal-diode on, cal-diode off). The relevant information is
stored in class 11, the “phase block”. At JCMT C11VD
specifies the names of the columns of the phase table
information stored in C11PHA, where the dimensionality is
specified by C3NSV (number of phase table variables) and
C3PPC (number of phases per cycle). NRAO uses C11VV to
store the values of a single switch state and the phase table
is C11PHT.
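
To make the JCMT arrangement concrete, the Python sketch below rebuilds a phase table from a flat C11PHA array using the column names in C11VD and the dimensions in C3NSV and C3PPC. The function name `build_phase_table` is hypothetical, and the storage order (all variables of the first phase, then all variables of the second, and so on) is an assumption of this sketch, not something stated in the text.

```python
def build_phase_table(c11vd, c11pha, c3nsv, c3ppc):
    """Return one dict per phase, keyed by the column names in C11VD."""
    names = [name.strip() for name in c11vd]
    if len(names) != c3nsv:
        raise ValueError("C11VD must hold C3NSV column names")
    if len(c11pha) != c3nsv * c3ppc:
        raise ValueError("C11PHA must hold C3NSV * C3PPC values")
    rows = []
    for phase in range(c3ppc):
        # Assumed layout: the variables of one phase are stored contiguously.
        start = phase * c3nsv
        rows.append(dict(zip(names, c11pha[start:start + c3nsv])))
    return rows


# Hypothetical column names and values, for illustration only.
table = build_phase_table(["VAR_A", "VAR_B"], [0.0, 1.0, 60.0, -1.0], 2, 2)
for row in table:
    print(row)
```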
Table 5: JCMT-specific keywords from class 7 to class 14. Class 55 is not included here as that data was not archived.

Item              Description
C7VREF            Velocity reference code; reference point for telescope & source velocity
C11PHA            Phase table: switching scheme dependent
C12ALPHA          Ratio of signal sideband to image sideband sky transmission
C12BM             Correlation bit mode
C12CAL            Units of spectrum data
C12CALTASK        Calibration instrument used (FE, BE, or USER)
C12CALTYPE        Type of calibration (THREELOADS or TWOLOADS)
C12CM             Correlation function mode
C12ETASKY         Sky transmission from last calibration
C12ETASKYIM       Frontend-derived sky transmission
C12ETATEL         Telescope transmission
C12GAINS          Gain value (kelvins per volt or equivalent)
C12GNORM          Data normalisation factor
C12GREC           Raw data units per Kelvin
C12GS             Normalizes signal sideband gain
C12INFREQ         BE input frequencies [GHz]
C12NOI            Noise value
C12REDMODE        Way of calibrating the data (RATIO or DIFFERENCE)
C12SBRAT          Sideband ratio
C12SCAN_TABLE_1   Begin scan table
C12SCAN_TABLE_2   End scan table
C12SCAN_VARS1     Names of the cols. of scan table 1
C12SCAN_VARS2     Names of the cols. of scan table 2
C12TAMB           Ambient load temperature
C12TASKY          Ratio of signal sideband to image sideband sky transmission
C12TCOLD          Cold load temperature
C12TSKY           Sky temperature at last calibration
C12TSKYIM         Frontend-derived Tsky, image sideband
C12TSYSIM         Frontend-derived Tsys, image sideband
C12TTEL           Telescope temp. from last skydip
C12VCOLD          IF V_COLD
C12VDEF           Velocity definition code - radio, optical, or relativistic
C12VHOT           IF V_HOT
C12VREF           Velocity frame of reference - LSR, Bary-, Helio-, or Geo- centric
C12VSKY           IF V_SKY
C13DAT            Reduced photometric value or Spectrum data or Reduced data
C13ERR            Standard error
C13RAW_ERROR      Raw error is accumulated over the scan, so store at end scan
C13RAW_ERROR_OP   Raw (out of phase) error also to be stored at end scan
C13RESP           Array of responsivities
C13SPV            Individual beam integrations or Raw data
C13SPV_OP         Raw out of phase data samples in each phase
C13STD            Phase data standard deviation
C14PHIST          List of xy offsets for each scan


C12IT is the total time spent collecting data, including any
blanking time. This item was not used at JCMT.

C12NI C12SPN indicate the number of integrations (or channels
for spectral line data) and the starting point (channel)
in the data vector. UniPOPS used this information to limit
display and processing to a sub-set of the data array and
to associate those limits with the data on disk. JCMT data
did not need these quantities for any similar purpose.

C12ST C12RMS are the computed source temperature and the
RMS value. The JCMT online observing system did not
calculate these.

C12SP is a description of the polarization type and angle
encoded in an eight character field. This item was not used
at JCMT.

C12RP C12X0 C12DX give the reference channel, the X
value at the reference channel, and the spacing along the X
axis. For spectral line data, the X axis is velocity at each
channel and for continuum data this is position along the
direction of telescope motion for each continuum integration
in that scan. These items were not used at JCMT.

C12WT is the water temperature. Not measured directly at
JCMT during this period.

C12OO C12OT are the oxygen opacity and temperature. Not
measured at JCMT.






Table 6: Items of the GSDD model only defined for use at NRAO.

Item     Description
C1DLN    Length of Data [bytes]
C1HLN    Length of Header [bytes]
C1SNA    Source Name
C2PC     Pointing Constants(4)
C2UXP    User Az/RA Pointing Correction [arcsec]
C2UYP    User El/Dec Pointing Correction [arcsec]
C4DO     Descriptive Origin(3)
C4IX     Indicated X Position [deg]
C4IY     Indicated Y Position [deg]
C6XZ     X Position at Map Reference Position Zero [deg]
C6YZ     Y Position at Map Reference Position Zero [deg]
C7FW     Beam Fullwidth at Half Maximum [arcsec]
C11TP    Phase Table
C11VV    Variable Value
C12DX    Delta X
C12IT    Total Integration Time [sec]
C12NI    Number of Integrations
C12OO    O2 Opacity
C12OT    O2 Temperature [K]
C12RMS   RMS of Mean
C12RP    Reference Point Number
C12SP    Polarization
C12SPN   Starting Point Number
C12ST    Source Temperature
C12WT    H2O Temperature [K]
C12X0    X Value at the Reference Point


The following items are found in both implementations, with
the discrepancies noted below.

C1DP This is used to specify the precision and data type used to
store the instrument data. This was used in early variants
of the JCMT system but was later dropped due to the data
format being able to report the data type associated with
each item.

C1ONA JCMT used this as a synonym for C1OBS and instead
added C1ONA1 to indicate the name of the support scientist
for the observing run and C1ONA2 to indicate the name of
the telescope operator.

C1STC specifies the type of observation. At NRAO this was
defined as two 4 character strings defining the type of data
and the observing mode, for example LINEPSSW for a
position-switched spectral line observation. JCMT used
this item solely to define the switching mode
(position-switched, beam switch, frequency switch and no switch),
preferring instead to use C1FTYP to specify the frontend
type (heterodyne versus bolometer) and C1BTYP to indicate
the backend type (line versus continuum). In later
versions JCMT dropped C1STC completely, preferring to
specify the switching mode explicitly in C6MODE.


Table 7: Coordinate codes used at NRAO and JCMT for item C4CSC.

Coordinate system    NRAO       JCMT
Galactic             GALACTIC   GA
B1950 RA/Dec         1950RADC   RB
Epoch RA/Dec         EPOCRADC   RD
Mean RA/Dec          MEANRADC   -
Apparent RA/Dec      APPRADC    -
Apparent HA/Dec      APPHADC    EQ
1950 Ecliptic        1950ECL    EC
Epoch ecliptic       EPOCECL    -
Apparent ecliptic    APPECL     -
Azimuth/Elevation    AZEL       AZ
User defined         USERDEF    UD
J2000 RA/Dec         2000RADC   RJ
Indicated RA/Dec     INDRADC    -
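
The correspondence in Table 7 can be written down as a simple lookup. The Python sketch below is illustrative only: the function names are hypothetical, NRAO codes are assumed to be blank-padded when shorter than eight characters, and the integer codes are those listed for C4CECO in Table 3.

```python
# NRAO eight-character code -> JCMT two-character code (None where Table 7
# shows no JCMT equivalent).
NRAO_TO_JCMT = {
    "GALACTIC": "GA",
    "1950RADC": "RB",
    "EPOCRADC": "RD",
    "MEANRADC": None,
    "APPRADC": None,
    "APPHADC": "EQ",
    "1950ECL": "EC",
    "EPOCECL": None,
    "APPECL": None,
    "AZEL": "AZ",
    "USERDEF": "UD",
    "2000RADC": "RJ",
    "INDRADC": None,
}

# Integer codes used at JCMT, as listed for C4CECO in Table 3.
JCMT_INTEGER_CODES = {1: "AZ", 3: "EQ", 4: "RD", 6: "RB", 7: "RJ", 8: "GA"}


def nrao_to_jcmt(code):
    """Translate an NRAO code; returns None if JCMT has no equivalent."""
    # A KeyError signals a code that is not in Table 7 at all.
    return NRAO_TO_JCMT[code.strip().upper()]


def jcmt_from_integer(value):
    """Translate a JCMT integer frame code to its two-character code."""
    return JCMT_INTEGER_CODES[value]


print(nrao_to_jcmt("1950RADC"), nrao_to_jcmt("AZEL    "), jcmt_from_integer(7))
```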


C3UT is the Universal Time in decimal hours, yet at JCMT it
was decided that this should refer to UT1.

C4CSC The JCMT coordinate system codes (Kenderdine,
1985) were a two-character code such as RB to indicate
B1950 RA/Dec. NRAO used a completely distinct set
of codes using eight characters; RB being equivalent to
1950RADC. The full list is shown in Table 7.

C5IR Whilst JCMT did use C5IR to report the mean refractive
index, the JCMT implementation also stored the three
refraction constants defined in the JCMT refraction model
(Kenderdine et al., 1988) as C5IR1, C5IR2 and C5IR3.

C6FC is the coordinate frame to use when offsetting, which
allows the offset system to be distinct from the telescope
tracking centre. At NRAO this was an eight character
string made up of two four character components (polar
versus cartesian and step versus scanning). At JCMT this
item was an integer indicating which coordinate frame
should be used with options of AZ=1, EQ=3, RD=4,
RB=6, RJ=7 and GA=8 (using the same definitions
explained in item C4CSC).

C7VRD is defined as the velocity definition and reference at
NRAO, by combining two four character strings into a
single item. It describes how the source radial velocity,
C7VR, should be interpreted. The allowed velocity
definitions were RADI (radio), OPTL (optical) and
RELV (relativistic). The velocity reference was allowed to
be LSR (Local Standard of Rest), HELO (Heliocentric),
EART (earth), BARI (barycentre) and OBS (observer).
At JCMT this item was reserved entirely for the velocity
definition but deprecated in later versions. The velocity
definition was later defined in C12VDEF (allowed values
being RADIO, OPTICAL and RELATIVISTIC) and the
standard of rest indicated in C12VREF (allowed values being
TOPO(centric), LSR, HELI(ocentric), GEO(centric),
BARY(centric) and TELL(uric)).
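
Several NRAO items in this list (C1STC, C6FC, C7VRD) pack two four-character codes into one eight-character string. The Python sketch below shows one way to unpack them. The helper names are hypothetical, short strings are assumed to be blank-padded, and the ordering of the two halves of C7VRD (definition first, then reference) is an assumption based on the wording above. Only the velocity definition is translated to the later JCMT C12VDEF values, because the text gives no explicit mapping between the NRAO and JCMT reference codes.

```python
def split_composite(value, width=4):
    """Split a two-part fixed-width string into its stripped halves."""
    padded = value.ljust(2 * width)
    if len(padded) != 2 * width:
        raise ValueError("expected at most %d characters" % (2 * width))
    return padded[:width].strip(), padded[width:].strip()


# NRAO velocity definition codes and the matching JCMT C12VDEF values.
VELOCITY_DEFINITIONS = {
    "RADI": "RADIO",
    "OPTL": "OPTICAL",
    "RELV": "RELATIVISTIC",
}


def decode_c7vrd(value):
    """Return (JCMT-style definition, NRAO reference code) for C7VRD."""
    definition, reference = split_composite(value)
    return VELOCITY_DEFINITIONS[definition], reference


print(split_composite("LINEPSSW"))
print(decode_c7vrd("RADILSR "))
```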
3. National Radio Astronomy Observatory

The 12-m Telescope was upgraded to write GSDD format
data in the summer of 1986 (Brown and Stobie, 1986; Stobie,
1987), which required that the data analysis system also be
updated to understand it.

In 1988 the NRAO decided, for a number of reasons, to unify
the data reduction systems for its single-dish telescopes: the
Tucson 12-m, and the Green Bank 300 ft and 140 ft telescopes.
At the time all three telescopes used what looked like a very
similar data reduction system, the People Oriented Parsing
Service (POPS; Hudson, 1982). At the code level, however, the
applications in Green Bank and Tucson had been diverging rapidly
since the early 1980s, essentially due to the different
computer architectures at the two sites (early-1970s Modcomps
in Green Bank and mid-1980s DEC VAXes in Tucson). The
NRAO wanted to reduce maintenance costs as different staff
were needed to maintain and develop each version. The NRAO
was also migrating to Unix-based (primarily Sun) computers,
a change that would require major modifications to POPS.
The unified analysis system, UniPOPS (Salter et al., 1995,
ascl:1503.007), was started in early 1989 and first released to
users in early 1991 (vanden Bout, 1991). Although the 300 ft
collapsed in 1988 (vanden Bout, 1990), and the 140 ft was
decommissioned for routine general-user astronomy in 1999,
UniPOPS is still in use today at some level by the University
of Arizona, who took over the running of the 12-m telescope in
2004.

Since the majority of the FORTRAN code that was modified
to create UniPOPS came from the 12-m version of POPS, the
UniPOPS developers decided that UniPOPS would also inherit,
with little modification, the underlying data structure and export
formats of the 12-m version of POPS. Internally, the UniPOPS
data structure used to hold the data is the same as the 12-m
version of the GSDD data model with additional items added as
described in section 2 and a new Class 9 to hold the values that
were unique to the Green Bank telescopes. The UniPOPS
file format is nearly identical to the POPS Data File (PDFF)
format in use at the 12-m prior to UniPOPS. This data
structure and file format were used by UniPOPS to hold data at all
stages of processing (raw, calibrated, averaged, smoothed, etc.).
Adapting 140 ft and 300 ft data to use the GSDD data model
was relatively easy, good evidence that GSDD was indeed a
rather versatile and useful standard. Additional details on the
export file format used by UniPOPS and the format of the raw
data written at each NRAO telescope are provided in the next
section.

Two modifications were made to the PDFF files when they
were incorporated into UniPOPS, solely to boost the
performance of the system. First, the binary representation was
changed from that of the DEC architecture to that of Sun
workstations. Second, the index that was at the start of a PDFF
file was extended to include such items as the sky location and
observing frequency to expand the items that could be efficiently
searched in UniPOPS. To distinguish the UniPOPS Sun-specific
exported files from VAX PDFF files, the NRAO developers changed
the name of the export format to Single Dish Data (SDD) format.


Bootstrap
    Number of records in index
    Number of data records following index
    Bytes per record
    Number of bytes per index entry
    Number of index entries used
    Update counter
    Type of SDD file
    Version

Index
    Start record number for current scan
    Last record for current scan
    Horizontal and Vertical coordinates
    Source name and scan number
    Freq resolution or slew rate
    Rest frequency or integration time
    LST
    Observing mode
    Record + phase
    Position code

Scans


Figure 1: Layout of a UniPOPS data file. The concepts are similar to those used 
in the GSD format (Fig. 2). The bootstrap field describes the basic layout of the 
file and the index indicates where each of the scans are located in the file. A 
key difference between GSD and SDD is that GSD contains a single observation 
whereas an SDD file contains many observations for a single science program. 

Other than a modification that expanded the capabilities of the 
index section of the NRAO SDD files, the SDD format adopted 
for UniPOPS (Fig. 1) remained unchanged until UniPOPS was 
retired at the NRAO in the mid-2000’s. 
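
One aspect of the change in binary representation is byte order: the VAX is little-endian, whereas the Sun workstations of the period were big-endian. The Python sketch below illustrates only that aspect, for 32-bit integers, using the standard `struct` module; the function name is hypothetical. Floating-point values would need more than a byte swap because the VAX floating-point formats are not IEEE 754, and that conversion is not attempted here.

```python
import struct


def vax_int32_to_sun(buf):
    """Re-encode little-endian 32-bit integers as big-endian."""
    count, remainder = divmod(len(buf), 4)
    if remainder:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    values = struct.unpack("<%di" % count, buf)
    return struct.pack(">%di" % count, *values)


vax_bytes = struct.pack("<2i", 1, 512)
print(vax_int32_to_sun(vax_bytes).hex())
```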

By the late 1980s, users of the NRAO telescopes were very
interested in seeing a FITS format implemented for the NRAO’s
single-dish telescopes (see § 5.7). By the mid-1990s, UniPOPS
could export and import data in Single Dish FITS (SDFITS) and
SDD formats, as well as many of the historical NRAO formats.
The NRAO found that very few users went away with SDFITS
format; most took home SDD files. Since users were installing
UniPOPS on their home computers, they probably found
transporting SDD files more convenient than using SDFITS files. It
was probably very rare that a UniPOPS SDD file was imported
into another analysis system. For example, a separate utility
was developed that would prepare data files that could be
imported into the class package. Furthermore, when SDFITS was
released, few FITS readers at the time could actually usefully
import binary tables. Thus, we suspect that frequent observers
grew into the habit of avoiding SDFITS files.

3.1. SDD File Format 

The layout of an SDD file is shown in Fig. 1 (see also Salter
et al., 1995). The file consists of 3 parts: a bootstrap record,
the index, and the data. An SDD file has an integer number of
records where the size of a record is given in the bootstrap. Each
section (index and data) is also an integer number of records.
Within the data section, each individual “scan” is one instance
of the data structure that evolved from the original GSDD data
model. Each scan occupies an integer number of records within
the file (any extra space is padded with zeros). An SDD file can
hold both spectral line and continuum data. The type of data is
indicated by the C1STC value found in the data for each scan.
The order and type of the values in the bootstrap record and
within each index entry are set by the version number recorded
in the bootstrap. The update counter item in the bootstrap
record is an integer that was incremented each time the file
was modified. This was used at the 12-m where the telescope
control system was writing to multiple SDD files while one or
more running UniPOPS sessions were reading from the same
set of SDD files. Unlike FITS, SDD is not a self-describing format.
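
As an illustration of how a reader might use the bootstrap record, the
following Python sketch unpacks the eight items listed in Fig. 1. The
field order follows the figure, but the field widths (eight 4-byte
big-endian integers) and all names are assumptions made for this
example and are not taken from the SDD specification.

```python
import struct
from collections import namedtuple

# Field order follows Fig. 1. The widths are an ASSUMPTION for this
# sketch: eight 4-byte big-endian integers.
BOOTSTRAP_FORMAT = ">8i"

Bootstrap = namedtuple(
    "Bootstrap",
    "index_records data_records bytes_per_record bytes_per_index_entry "
    "index_entries_used update_counter file_type version",
)


def parse_bootstrap(raw):
    """Unpack the leading bytes of a file into a Bootstrap tuple."""
    return Bootstrap(*struct.unpack_from(BOOTSTRAP_FORMAT, raw))


def index_entries_per_record(bootstrap):
    """Index entries are sized to fill each record exactly."""
    if bootstrap.bytes_per_record % bootstrap.bytes_per_index_entry:
        raise ValueError("index entries do not fill a record exactly")
    return bootstrap.bytes_per_record // bootstrap.bytes_per_index_entry


def data_section_offset(bootstrap):
    """Byte offset of the first data record (hypothetical layout:
    one bootstrap record followed by the index records)."""
    return (1 + bootstrap.index_records) * bootstrap.bytes_per_record
```

The record-based arithmetic is the point of the sketch: every section
boundary is a multiple of the record size read from the bootstrap.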

The bootstrap record contains information used to read the
index records. The size of the index entry and record were
chosen so that there are an integer number of index entries in
each record without any extra space. The index record is read
into memory in UniPOPS when an SDD file is opened and the
copy in memory is kept in sync with the contents of the file as
changes are made. Data selection in UniPOPS only uses the
fields in the index. Every scan in the file must have an entry
in the index section. Empty index entries have zeros for the
start and last record numbers. The current largest index number
in use is indicated in the bootstrap. If the data query involved
the index associated with one of the SDD files being written by
the 12-m during observing, then the update counter value in
the bootstrap record on disk was checked. When a change was
seen in that value then the copy of the index in memory was
regenerated from the data file before the data query was done.
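
The update-counter check described above can be sketched as a small
cache. This is not UniPOPS code: the class name and the two callables
are hypothetical stand-ins for "read the counter from the bootstrap on
disk" and "read the whole index from disk".

```python
class IndexCache:
    """Keep an in-memory copy of the index and refresh it only when
    the on-disk update counter has changed (illustrative sketch)."""

    def __init__(self, read_counter, read_index):
        self._read_counter = read_counter   # hypothetical callable
        self._read_index = read_index       # hypothetical callable
        self._counter = read_counter()
        self._index = read_index()
        self.reloads = 0

    def index(self):
        """Return the index, regenerating it if the file was modified."""
        current = self._read_counter()
        if current != self._counter:
            self._index = self._read_index()
            self._counter = current
            self.reloads += 1
        return self._index
```

A data query would call `index()` first, so the cost of re-reading the
index is only paid when the writer has actually touched the file.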

Each index entry indicates where the associated scan data
starts and ends. The scan data begins with a preamble, which is
16 short integers giving the number of classes and the starting
8-byte location in the header where each class of header words
starts. Within each class, the type and order of each value is
fixed. The C1HLN value gives the length of all of the header
values. Over time, new items were added to the ends of some
classes so that UniPOPS retained the ability to read previous
versions of the SDD format by not attempting to read header
words past the end of a class as indicated by the values in the
preamble and C1HLN. These new items are the differences
between the NRAO and JCMT versions of the GSDD data model
mentioned in section 2. Those differences grew over time as
needed by NRAO to accommodate new instruments, observing
techniques and reduction methods. All numerical values in the
header are stored as 8-byte floats. All string values are stored
as multiples of 8 characters, depending on the specific value.
The data vector immediately follows the header. The data are
always 4-byte floats.
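
The way the preamble allows older files to be read can be sketched as
follows. The sketch assumes big-endian values, that the first short is
the number of classes, and that class start locations are 1-based
counts of 8-byte words measured from the start of the scan; these
conventions and the function names are assumptions for illustration
only.

```python
import struct


def class_extents(scan, header_length):
    """Return (start_byte, end_byte) for each class in a scan.

    ASSUMED layout: 16 big-endian shorts, the first holding the number
    of classes and the following ones the 1-based starting location of
    each class in units of 8 bytes.
    """
    shorts = struct.unpack_from(">16h", scan)
    n_classes = shorts[0]
    starts = [(word - 1) * 8 for word in shorts[1:1 + n_classes]]
    # A class ends where the next one starts; the last class ends at
    # the end of the header.
    ends = starts[1:] + [header_length]
    return list(zip(starts, ends))


def read_class_values(scan, extent, n_expected):
    """Read up to n_expected 8-byte floats from one class.

    Items that a newer reader expects but that lie beyond the end of
    the class in an older file are returned as None, not read.
    """
    start, end = extent
    n_present = min(n_expected, (end - start) // 8)
    values = list(struct.unpack_from(">%dd" % n_present, scan, start))
    return values + [None] * (n_expected - n_present)
```

Because a reader never looks past the extent recorded in the file,
items appended to the end of a class in a later version do not break
the reading of files written by an earlier version.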

In order to accommodate spectral line and continuum data
within the same structure and minimize the amount of space
needed to store the associated header, some header values have
2 meanings depending on the type of data. This is most obvious
in class 12, where the associated X axis is described. For
spectral line data the X axis is the frequency or velocity at each
channel. For continuum data, the data vector is a series of
regularly sampled data (each sample is one integration) so the
X axis is related to the position on the sky as the telescope is
slewed. UniPOPS was started either in spectral line mode or
continuum mode and would need to be restarted to switch modes.
It was never possible to work on both continuum data and
spectral line data within the same session of UniPOPS so
typically a single SDD file only contained one type of data,
although that was not required by the file format.

3.2. SDD Usage 

UniPOPS dealt directly only with SDD format files. An SDD
file could contain raw data, individual integrations and
calibrated data. When writing to an SDD file, UniPOPS could
extend that file by appending to the end or it could overwrite
existing data in the file provided that the size of the scan being
overwritten was at least as large as the scan being written. In
either case, the appropriate index entry was updated. In the
case of appending to the file, the next index location after the
current end as indicated in the bootstrap record was used. The
index entries do not need to reflect the order in which the data
appear in the file, although that typically is the case. If a user
tried to overwrite a scan with a scan with more channels, UniPOPS
would append the new scan to the file, replace the index entry
for the original scan with an appropriate index entry for the new
scan, and replace the original scan records in the SDD file with
zeros.
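
The overwrite-or-append behaviour can be summarised in a short sketch.
The file is modelled as a Python list of fixed-size records and the
index as a dictionary; both are simplifications for illustration. The
treatment of records left over after a smaller scan overwrites a
larger one (they are zeroed here) is an assumption and is not stated
in the description above.

```python
ZERO_RECORD = b"\x00" * 8   # stand-in for one zero-filled record


def store_scan(records, index, slot, scan_records):
    """Write a scan into `records`, updating `index[slot]`.

    `index` maps an index slot to (start, last) record numbers,
    1-based and inclusive (an assumed convention).
    """
    n_new = len(scan_records)
    entry = index.get(slot)
    if entry is not None:
        start, last = entry
        n_old = last - start + 1
        if n_old >= n_new:
            # The existing space is large enough: overwrite in place.
            records[start - 1:start - 1 + n_new] = scan_records
            for i in range(start - 1 + n_new, last):
                records[i] = ZERO_RECORD      # assumed clean-up
            index[slot] = (start, start + n_new - 1)
            return "overwritten"
        # The new scan is larger: blank the old records, then append.
        for i in range(start - 1, last):
            records[i] = ZERO_RECORD
    start = len(records) + 1
    records.extend(scan_records)
    index[slot] = (start, start + n_new - 1)
    return "appended"
```

The sketch also shows why index order need not match file order: a
scan that outgrows its space moves to the end of the file while
keeping its index slot.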

UniPOPS provided observers with access to their raw data in
near real-time. For the 12-m this was direct access to the current
set of SDD files being written by the telescope control software.
Multiple files could be written at the same time, depending on
the backend, and UniPOPS could access the desired data from
any of those files while the data was being taken. For long
observing sessions multiple versions of each backend-specific file
were written. The 12-m also provided SDD files containing
system temperature across the bandpass for each scan. UniPOPS
provided separate methods for accessing that calibration data
but the file format was identical to all other SDD files. For the
140 ft, the raw data was written in the original telescope format
produced by the Modcomps. This raw 140 ft telescope format
predates the GSDD data model. A conversion step to the NRAO
version of the GSDD data model was necessary for UniPOPS
to use that data. While observing, that conversion step
happened on demand within UniPOPS. Access to the raw 140 ft
data within UniPOPS could be done remotely by an observer
running UniPOPS at their home institution. The UniPOPS user
could then choose to save that raw 140 ft data directly to disk
in an SDD format file or they could process the data and only
save those scans to disk. A separate data conversion tool was
also provided to convert an entire observing session at the 140
ft from raw telescope format data to an SDD format file which
could be read directly by UniPOPS without any network
connection to the raw data.

SDD format files could be used interchangeably for input and
output by UniPOPS. Typically a raw, uncalibrated data set was
used as input and the user would save processed spectra to a
separate SDD file. UniPOPS users could choose to save their
data to disk at any stage of processing. Any single SDD file
could contain raw, calibrated, or reduced data in any
combination. Typically most users kept the raw data separate
from the processed data as that made it simpler to keep track of
what had been done. A single output SDD file often contained the
same data at different processing steps. The UniPOPS user needed
to keep track of what had been done to the data as no processing
history information was associated with data either internally
or in the SDD file.

With the interactive UniPOPS environment, users had the
ability to modify any of the GSDD data model items (header
values) for any scan. These header values were referenced by
the UniPOPS interpreter using slightly more readable names
(e.g. C1SNA is OBJECT in UniPOPS). The UniPOPS Cookbook
(Salter et al., 1995) uses those more readable names to reference
the GSDD data model items. Internally, the compiled code that
comprises UniPOPS (mostly Fortran) uses the original GSDD
data model names (known at the JCMT as the “NRAO” names).

The number of scans that an SDD file can contain is set
by the size of the index section. Scripts were provided with
UniPOPS to expand an existing SDD file if more index space
was necessary. UniPOPS could not read or write SDFITS
directly. Separate conversion tools were necessary to produce and
consume SDFITS. Conversion tools were also provided for
historical NRAO formats including the PDFL format used at the
12-m prior to UniPOPS.

Archives from both the Green Bank 140 ft and Tucson 12-m
telescopes exist. For the 12-m, there are about 200 GB of
archived SDD format files. The archive from the 140 ft
consists entirely of telescope format files. The current Green Bank
single dish analysis package, GBTIDL (Marganian et al., 2006,
ascl:1303.019), can read archived SDD files. GBTIDL uses
SDFITS as its primary data format.

4. James Clerk Maxwell Telescope 

4.1. Requirements 

During the development of the JCMT software libraries at
the Mullard Radio Astronomy Observatory, a number of
options were considered for the raw data file format. Two
obvious options were available in the astronomical community in
the form of the Flexible Image Transport System (FITS; Wells
et al., 1981) and the Starlink Hierarchical Data System (HDS;
Disney and Wallace, 1982; Jenness, 2015, ascl:1502.009).

FITS was discounted as the primary data format because of
the large amount of overhead required to format the header
information when writing files and the inability of the format (at
that time) to store more than one data array or table in a file.
FITS files at the time were not capable of storing binary tables;
only ASCII tables were possible (Harten et al., 1988)
and those were not standardised until 1987. It was also felt that
the DEC Backup Utility was more reliable for transport and
archiving than using a specialist FITS tape format. Whilst the
FITS community would eventually support multiple data arrays
(Grosbøl et al., 1988) and binary tables (Cotton et al., 1995), it
was not possible to wait for that to happen.


HDS was discarded for I/O efficiency reasons and the
inability for the entire file to be mapped into memory in one
operation. Additionally it was felt that the HDS library API required
too many calls to do simple tasks, and although these calls could
be wrapped in higher level subroutines, the overhead associated
with the many lower level calls would be too high. One further
option was to use the NRAO 12-m file format (PDFL) but that
also suffered (from the JCMT perspective) from serious I/O
issues and could not be used on the acquisition hardware initially
targeted for JCMT.

The computer used during testing and commissioning in
1985/86 was a VAX 11/730 with 4 MB of RAM, which had
severe performance limitations. This was upgraded to a
MicroVAX with 16 MB of RAM just before operations started at
JCMT in 1987 but performance was the key design driver: the
control system was required to minimize the overheads in data
capture and therefore maximize the observing time. The VAX
Record Management System (RMS) was the basis of all
standard VAX records-based file handling. The performance of this
system was not suitable for real-time operation as it was not
acceptable for the system to pause while opening or closing or
extending a file in the middle of the data collection.
Furthermore, limits on the maximum record length in RMS meant that
additional complexity would be required when writing out data
from long observations. The JCMT disk I/O approach was
instead designed to utilize the VAX System Services library that
allowed a program to map a section of virtual memory (referred
to as a Global Section) and then manage the scalar and array
data in that memory directly in the program. This was very fast
and did not cause the problems encountered with RMS.
Performance benchmarks on a VAX 750 (Fairclough, 1988) suggested
that I/O operations using RMS were approximately five times
slower than using a Global Section. The use of a Global
Section also allowed other applications read access to the contents
of the file whilst it was being written and also meant that the
data already acquired would be usable even if the acquisition
software crashed mid-observation.
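
The general idea of writing through a mapped file can be illustrated
on a modern system with Python's `mmap` module. This is only an
analogue of a VAX Global Section, not the VAX System Services
interface; the file name, sample type and function name are
assumptions chosen for the example.

```python
import mmap
import os
import struct
import tempfile


def acquire_and_peek(n_samples):
    """Write samples through a memory map, then read them back through
    an ordinary file handle, as a second application might."""
    size = 4 * n_samples
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "observation.dat")
        # Pre-allocate the whole file once, before "observing" starts,
        # so it never has to be extended during data collection.
        with open(path, "wb") as f:
            f.truncate(size)
        with open(path, "r+b") as f:
            mapped = mmap.mmap(f.fileno(), size)
            try:
                for i in range(n_samples):
                    # Values are placed directly in the mapped memory;
                    # there is no per-sample write() call.
                    struct.pack_into("<f", mapped, 4 * i, float(i))
                mapped.flush()
                with open(path, "rb") as reader:
                    seen = reader.read()
            finally:
                mapped.close()
    return struct.unpack("<%df" % n_samples, seen)
```

Pre-allocating and mapping the file moves the expensive operations
(create, extend) outside the data-taking loop, which is the property
the JCMT design relied on.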

These requirements led to a new disk format being devised
and an associated I/O library written which used the GSDD data
model, but used Global Sections for writing to disk. This led to
the JCMT implementation of the library being known as the
Global Section Datafile System (GSD; Fairclough, 1988).⁶ The
file format design was influenced by the NRAO idea of a
self-describing GSDD implementation and also the concept of an
“in memory data base management system”⁷ from the MON
library being used in the JCMT control system.⁸ JCMT adopted


⁶In retrospect, the similarity of acronyms between GSD and GSDD - two
quite separate concepts - was rather unfortunate. The naming of the library
as GSD eventually led to JCMT users referring to the files as being of “GSD
format” and it being assumed that “GSDD format” was an historical artefact.

⁷A database system designed to work entirely in memory rather than
requiring lots of disk I/O. See also
http://en.wikipedia.org/wiki/In-memory_database.

⁸The MON library was a shared memory system, based on Global Sections,
in use at the JCMT to allow the individual control system tasks to easily share
state information. It was the precursor to the Noticeboard System (NBS; Lupton
et al., 1995).

File Description
    Version
    Maximum number of items
    Number of items
    Start and end of data segment
    Comment
    File Size

Item Descriptors
    Array item?
    Name (and length of name)
    Unit string (and length of unit)
    Data type
    Location in data segment
    Number of bytes in data segment
    Number of dimensions
    Dimensions (by scalar item reference)

Data


Figure 2: Layout of a JCMT GSD data file. The file descriptor indicates where 
the data starts and the number of items in the data. The item descriptors describe 
each of those items and where they are located in the data segment. The size of 
each dimension in array items is defined in terms of other scalar items. The file 
was pre-allocated by the acquisition system at the start of the observation rather
than being continually extended. 

the GSDD data model in the hope that the downstream data
reduction systems could be compatible through the shared
metadata conventions.

Unlike the NRAO PDFL/SDD files which grow throughout
the night as more data are taken, a JCMT GSD file was only
required to store data from a single observation. At JCMT an
observation was defined as data being taken in a single switching
mode at a single tracking position with a single instrument
frontend/backend combination. A single observation could include
multiple offsets in a grid or on-the-fly map and includes the full
map area, rather than a single row or column. This approach
resulted in more files to track in a night but was felt to
simplify the acquisition software (each observation was completely
independent of what had gone before), make it easier to
distribute subsets of a night’s data amongst different observers
(a pre-requisite for flexible scheduling) and simplify queries for
individual observations from the data archive. Of course, this
meant that the data reduction packages had to do more work to
collate related observations into a coherent data set as they now
worked with many independent files rather than being able to
treat a night’s observing as a single coherent entity.

4.2. File Format Design 

The layout of a JCMT GSD format file is shown in Fig. 2
(see also Fairclough et al., 1987). The file is split into three
segments: the file descriptor, the item descriptors and the data
itself. The file descriptor contains a general description of the
file indicating its version, the number of items written and the
start position of the data array. The item descriptors define each
of the items in terms of the label and units and the position
within the data array. The data itself is a single block at the end
of the file following the item descriptors, which define exactly
where in the data array each item is located and how many bytes
in the data array it occupies.

For array items (GSD supported up to 5 dimensions), the
identity of each dimension is specified in terms of the number
of a scalar item. This allows the label and unit to be associated
with each dimension of an array item in addition to the size of
the dimension. A negative number of dimensions indicates that
an item is a scalar that defines an array dimension. For example,
the C11PHA array entry in a JCMT DAS spectrum (Bos, 1986)
is dimensioned according to the scalar items C3NSV, the
number of phase table variables, and C3PPC, the number of phases
per cycle. The item descriptor for C11PHA would therefore
contain a dimensions array of two elements containing the item
numbers (position in the item descriptor section) for C3NSV and
C3PPC. A library user would then look up those two items to
determine the dimensionality of C11PHA.
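
The lookup of array dimensions through scalar items can be sketched
as below. The `Item` structure, the 1-based item numbering, the empty
unit strings and the numeric values are all assumptions for
illustration; this is not the GSD library interface.

```python
from collections import namedtuple

# Simplified stand-in for an item descriptor plus its scalar value.
# `dims` holds the item numbers of the scalars that size each axis.
Item = namedtuple("Item", "name unit dims value")


def resolve_dimensions(items, name):
    """Return (name, unit, size) for each dimension of an array item."""
    by_name = {item.name: item for item in items}
    array_item = by_name[name]
    resolved = []
    for number in array_item.dims:
        scalar = items[number - 1]   # item numbers ASSUMED to be 1-based
        resolved.append((scalar.name, scalar.unit, scalar.value))
    return resolved


# Hypothetical contents: the sizes 2 and 4 are made up for the example.
example_items = [
    Item("C3NSV", "", (), 2),
    Item("C3PPC", "", (), 4),
    Item("C11PHA", "", (1, 2), None),
]
```

Calling `resolve_dimensions(example_items, "C11PHA")` returns one entry
per axis, each carrying the label and unit of the scalar that defines
it as well as the size.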

This file design resulted in a fully self-describing system
where there was no requirement for items to be grouped by
class in the file and no requirement for the order of items to be
pre-determined (an issue for the NRAO implementation where
the order was specified in a compiled include file requiring that
the order of items within a class be preserved and also that new
items could only be added to the end of a class). A user of
the format could either request an item by number or request
an item by name. Storing the units with the data also allowed
for more flexibility in data model representation at the expense
of more logic in the application code that might have to
understand unit conversions. Application software would use the file
version number to decide which variant of a data model was
present in the file. At JCMT this became important as the
system evolved in the first few years.⁹

The JCMT format implementing GSDD supports the
standard Fortran data types of byte, word, logical, integer, real,
double and character strings, and uses VAX floating point
format (see Payne and Bhandarkar, 1980, for more information on
VAX floating point format). To simplify the format, character
strings have a fixed size of 16 characters, item names are fixed
at 15 characters and unit strings are fixed at 10 characters. The
format supported the concept of a “null” value by reserving the
most negative value of each data type for that purpose (using
a single space as the null character value and false as the null
logical value). Additionally, the JCMT GSD library supported
data type conversion, allowing a user to request a value in a
different type to how it was stored natively in the file. This was an
important aspect of the library interface, simplifying code
required by the reduction software, enabling users of the library
to request data in the form most suitable for them. This feature


⁹Version 4 of the JCMT data model was the first stable implementation,
released late in 1988, and the final version was release 5.3. It is version 5.3 that
is documented here.

was influenced by earlier work on the Starlink Catalog Access
and Reporting (SCAR) relational database management system
for astronomical catalog handling (Walker et al., 1990).¹⁰
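
A minimal sketch of null-aware type conversion for the integer types
is given below. It assumes that byte, word and integer are signed 1-,
2- and 4-byte values and that a null must remain a null after
conversion; the function name and error handling are hypothetical and
do not describe the GSD library itself.

```python
# Null sentinel = most negative value of each type. The widths
# (1, 2 and 4 bytes, signed) are an ASSUMPTION for this sketch.
INTEGER_NULLS = {"byte": -2**7, "word": -2**15, "integer": -2**31}


def convert_integer(value, stored_type, wanted_type):
    """Return `value` as `wanted_type`, mapping null to null."""
    if value == INTEGER_NULLS[stored_type]:
        return INTEGER_NULLS[wanted_type]
    null = INTEGER_NULLS[wanted_type]
    # Valid values lie strictly above the null and up to the type maximum.
    if not (null < value <= -null - 1):
        raise OverflowError(
            "%d cannot be represented as %s" % (value, wanted_type)
        )
    return value
```

Mapping the sentinel explicitly matters because the null of a narrow
type (for example -128 for a byte) is an ordinary, valid number in a
wider type.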

4.3. Format Usage 

The JCMT took data in the GSD format for all instruments
(heterodyne and continuum) from the telescope
commissioning (circa 1986) to the delivery of SCUBA in 1996 (Holland
et al., 1999). The GSD format continued to be used for
heterodyne instruments until the delivery of the new ACSIS correlator
in 2006 (Buckle et al., 2009). SCUBA and newer instruments
wrote data in the Starlink extensible N-Dimensional Data
Format (NDF; Jenness et al., 2015), although SCUBA’s data model
was not precisely copied for ACSIS and SCUBA-2 (Holland
et al., 2013) data. NDF had a key advantage that it was being
used throughout the Starlink Software Collection as the primary
data format (Allan, 1992). Writing data using NDF meant that
JCMT data files had immediate access to all the visualization
and analysis applications already available to the community
such as KAPPA (Currie and Berry, 2013, ascl:1403.022). Many
of the performance worries from the mid-1980s concerning the
overhead associated with the HDS library were no longer
relevant in the late 1990s.

The GSD data access library was a VAX-specific library
(Fairclough et al., 1987; Hewish et al., 1986) written in
Fortran and making extensive use of VAX system calls. When the
last instrument moved off the VAX/VMS data acquisition
computers the format could no longer be used and was retired.
There was little motivation to port the data model to the newer
instruments as it was clear by this time that GSDD had not
succeeded and that NDF would be more useful to the JCMT user
community despite the resulting necessity for new ways of
describing raw JCMT data. There was seen to be no advantage
to moving the GSDD class and item names to the newer
NDF-based raw data models. Indeed, as described in § 5.1 the
standards effort was dead and it was not obvious to later users and
software developers where such opaque names had
originated.

The GSDD data files are archived at the Canadian
Astronomy Data Centre and approximately 440 000 GSD format files
are in the archive, totalling approximately 30 GB. In order to
access these data files on a Unix system a new read-only
version of the GSD library was written in C (Jenness et al., 1999,
ascl:1503.009) and integrated into the standard data reduction
tools SPECX (Padman, 1990, 1993, ascl:1310.008), COADD
(Hughes, 1993, ascl:1411.020) and JCMTDR (Lightfoot et al.,
2003, ascl:1406.019). The GSD format is relatively simple and
the main complication in the new C (and later pure Java)
implementations was the conversion of VAX floating point format to
IEEE format. Furthermore, computers were sufficiently more
powerful by the time the Unix version was written that there
was no need to use memory mapping; the entire contents of a
file is read into memory. GSD was solely used as a data
acquisition format at JCMT, with there being one application on

¹⁰HDS also supported automatic type conversion (Lupton, 1989). The
authors are not sure when equivalent facilities were added to FITS I/O libraries.


the VAX to enable the editing of contents if there was a need 
to fix some metadata. Data reduction applications never wrote 
data out in GSD format and the Unix port of the library did not 
have the ability to write a GSD file. A Perl interface to the Unix 
C GSD library (Jenness et al., 1999) was implemented to allow 
the preview of spectra for remote observers when doing flexible 
scheduling (Jenness et al., 1997). 
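
The floating point conversion mentioned above can be illustrated for
the single-precision VAX F_floating type. The sketch below is an
independent illustration in Python, not the code of the C or Java GSD
libraries, and it does not cover the double-precision VAX formats.

```python
import math
import struct


def vax_f_to_float(raw):
    """Convert 4 bytes holding a VAX F_floating value to a Python float.

    The value is stored as two little-endian 16-bit words. The first
    word holds the sign bit, an 8-bit excess-128 exponent and the top
    7 bits of the fraction; the second word holds the low 16 fraction
    bits. The mantissa has a hidden leading bit and lies in [0.5, 1).
    """
    word1, word2 = struct.unpack("<2H", raw)
    sign = word1 >> 15
    exponent = (word1 >> 7) & 0xFF
    fraction = ((word1 & 0x7F) << 16) | word2
    if exponent == 0:
        if sign:
            # Sign set with zero exponent is a VAX reserved operand.
            raise ValueError("VAX reserved operand")
        return 0.0
    mantissa = (0x800000 | fraction) / float(1 << 24)
    value = math.ldexp(mantissa, exponent - 128)
    return -value if sign else value
```

For example, the bytes `80 40 00 00` decode to 1.0. Note that the VAX
exponent bias and word order both differ from IEEE 754, so a plain
byte swap is not sufficient.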

The GSD format files are no longer part of the publicly
available query system at the CADC. This was driven by
funding constraints when the CADC system was re-engineered to
use a common internal data model (Redman and Dowler, 2013)
and a requirement that federal interfaces be compliant with
Canadian language regulations. The JCMT Science Archive
(JSA; Economou et al., 2015) therefore does not contain GSD
data. To extend the useful life of the GSD format observations
and to make the observations available to the widest possible
community through the JSA and the Virtual Observatory, there
was a project to convert the GSD heterodyne files archived
at CADC to the modern ACSIS format (Jenness et al., 2007)
such that they can be processed (baseline subtracted, co-added,
placed into data cubes) using the standard JCMT data
reduction pipelines (Jenness et al., 2008; Jenness et al., 2015). The
SMURF data reduction application (ascl:1310.007) contains
the ability to read GSD files and migrate them to the modern
format (Balfour, 2008). The GSD files from the earlier
continuum instruments, such as UKT14 (Duncan et al., 1990), will
remain in the archive although they will not be visible through
the JSA interface.


5. Retrospective 

GSDD has had a mixed history and in this section we look 
back on the good and bad of GSDD. 

5.1. The hidden standard 

The key failure of GSDD was that most of the developers
and users of the format did not realize that it was a standard and
therefore there was no impetus for the respective observatory
staff to continue to communicate as systems evolved. The
initial developers of the JCMT system did not maintain the data
acquisition software in Hawaii and, at NRAO, the lead
developer of the 12-m GSDD system left NRAO before the end of
the 1980s. When staff who worked at NRAO and JCMT after
the respective implementations of GSDD compatible systems
were interviewed, it was very rare for anyone to remember that
there was an intent for a standard to be in place. As can be seen from
the evolution of the JCMT class names and the divergence of
data models, items were added to the respective data formats
without any communication between the nominal GSDD
partners. 12-m development continued with tweaking of the
acquisition and reduction formats independently. As the GSDD
model evolved, the NRAO implementation resulted in 24 items
that are not present in the JCMT implementation (not including
the classes explicitly specified to be locally defined), and 154
items that are defined by JCMT but not defined by NRAO.
The goal of unified data reduction software understanding
GSDD never materialized. Indeed, interoperability usually
occurred, if at all, by exporting the files into a completely different
format that could be understood by CLASS.

In conclusion, it is impossible for a standard to survive as a
standard if no one knows they are using a standard; the effort
must be made to broadcast and properly document the effort
within the wider community as part of the original
development.

5.2. A model must define the values and units 

Whilst the data model provided a reasonable baseline for how
to name items, it broke down almost immediately when it came
to storing values in those items. For example, the coordinate
codes, C4CSC, were not standardised, the reference frame
coordinate code, C6FC, had a whole different concept at JCMT and
NRAO and, indeed, the specification of how observing grids
were defined at both observatories differed despite sharing the
same underlying item names. If an attempt had ever been made
to transfer data between observatories, special code would have
had to be written to import the data, removing most of the gains of
a shared model. This was due to a failure to fully develop the
standard prior to starting its implementation. In some sense,
the initiative was not subjected to proper project management
procedures as we currently understand them.

5.3. Embrace Flexibility 

A major advantage of GSDD is that the standard actually
allowed sites to alter the format and data model as they saw fit.
NRAO sites using the NRAO file format had to follow some
minor rules in order to guarantee that any other site’s GSDD
reader could still manage the files. The rules were: do not touch
the pre-defined keywords (which were to have predefined byte
sizes and were always to be in a certain order at the start of a
class), you are free to add new keywords to any class but only
at the end of the pre-defined section of each class, modify class
9 for your particular telescope, modify class 10 as convenient,
and be sure to use the well-defined preamble to designate the
byte at which every class begins. We maintain that GSDD was
actually a very good implementation for its time because these
rules could be easily adhered to while simultaneously giving
sufficient versatility to each telescope. The JCMT GSD file
format encouraged far more flexibility than this since the
constraints on class keyword ordering were removed and software
did not need to compile-in knowledge of where the individual
items were meant to be located in the file. This led to much
more explosive and dynamic modifications to the data model in
the early years of telescope operations.

5.4. Too much flexibility is not always good 

The alternative view is that allowing a class 9 for particular
telescopes to use as they liked was an impediment to
standardization. In many cases an item being added to class 9 could
have been made generically useful with some discussion or may
well have been very similar to an item already in use by another
telescope. The use of the escape hatch class should have been
treated as a last resort after debate within the community. Only
when it was determined that a particular item was unique for a
telescope should class 9 have been used, and even then a case
could be made that it would still be more helpful for the item to
have been placed in the correct class and documented as such,
to help the next telescope that required similar functionality.
In some sense this was the approach used at JCMT (without
the communication effort) which was simply to ignore class 9
completely and add items to the “correct” classes without
discussion in the wider community. As the JCMT model evolved
it was soon clear that many of the items were not relevant to
particular observing modes. Rather than attempting to always
write them out regardless, it was decided to treat them as true
optional items. This difference between JCMT and NRAO may
have been driven by file format design given the difference in
approach between the self-describing GSD and the more
statically defined PDFL. In retrospect it would have been better to
attempt to standardize even at the expense of having to spend
more time in discussion.

5.5. Clear separation of model from file format 

GSDD benefited by explicitly defining the data model for
single-dish observing distinct from bytes on the disk.
However, whether by accident or design, the GSDD standard
resulted in multiple software implementations writing the data to
disk in different formats and using different techniques. The
JCMT GSD format was never written on anything other than a
VAX but the NRAO format migrated from PDFL to SDD
going from VAX to Unix. Unfortunately these multiple formats
also meant that data reduction software wishing to read the data
would need to implement multiple file readers. The reality is
that this work was never done. Given the focus of both institutions
on the use of GSDD in data acquisition using different
hardware platforms and different performance constraints, this split
is not surprising, but it is interesting to contemplate how
interoperability would have improved if the standards effort had also
included the definition of an interchange format. Being easily
able to compare a JCMT spectrum with an NRAO 12-m
spectrum from within the same data analysis package would have
been extremely useful to the young sub-mm community.

5.6. A success apart 

Despite the lack of communication between implementors 
and the drift in specifications, the GSDD format itself can be 
thought of as a success when the uses of the format are looked 
at independently. The JCMT GSD format was used for many 
years and files in this format are still available. The related 
format continues to be used at the 12-m Telescope. 

5.7. Feeder for SDFITS 

GSDD was a very early attempt by independently funded and
operated observatories to agree on a shared data model. True
interoperability of raw telescope data amongst multiple data
reduction software packages was an important goal that was ahead
of its time. Arguably the key outcome of GSDD was that it
motivated people to work together towards a shared data format
based on FITS. The GSDD experience fed in to a workshop held at
Green Bank in late 1989 that discussed how the community could
migrate to a single-dish FITS format. This was a key motivator
for the adoption of binary tables into the FITS standard (Cotton
et al., 1995) and ultimately led to the SDFITS standard (Garwood,
2000).

5.8. Communication 

A failing of GSDD is that when developers had real, practical
reasons to break a rule (e.g., needing a double precision word for
a pre-defined keyword when the standard required single precision,
a string needing 32 characters instead of 16, changing the byte
representation from that of a VAX to IEEE), no forum had been set
up that could negotiate modifications to the standard. This is
unlike the FITS world where revisions to the definition have to
pass through a standards group. A key lesson is that when a
standard is set up, the agreement should go beyond the expectation
that ad hoc conversations between staff at different observatories
are a sufficient means of keeping the standard viable.

The JCMT GSD library was documented and stable and the UK had the
Starlink Project (Disney and Wallace, 1982) to publish the
software and data files to the UK community. However, access to
that network from other countries, such as the US, was
problematic, which hindered the spread of the software and
prevented take-up. Fears of lack of support also drove people to
create their own in-house solutions.

Today, 30 years on, the Internet and the culture of open-source
development make that much less likely, and good ideas have a
tendency to become distributed and to generate a supporting
community outside of the original developers that ensures their
survival and growth.

6. Thoughts on the Future 

Many of the lessons exposed by the history of GSDD have already
been learned in the 30 years since the key decisions were made,
and much improved communications infrastructure has changed the
way that people work. The current debate on future developments
of data formats for astronomy (see e.g. Mink et al., 2015; Mink,
2015; Thomas et al., 2015) indicates that there is a desire within
the community for a format that builds on the lessons learned
using the FITS format to develop a format with more modern
underpinnings. As noted in the debate described in Mink et al.
(2015), representing data on disk is becoming a secondary concern
relative to the discussion of data models. A data model can be
serialized into many different transport and archive formats, and
it is relatively easy to make applications flexible enough to be
able to cope with these differences. By contrast, it is much
harder to deal with different data models, and implementation
efforts should concentrate on optimizing and generalizing the
data model that is being used. This is, after all, the underlying
business logic that enables science to progress. It may be true
that all data models can be represented in a FITS file but that
doesn’t mean that a FITS file is the most compact or most
efficiently accessed format. Changing the underlying file format
used in astronomy may simplify infrastructure libraries and
result in new abilities not available from within FITS. The
easiest way to migrate people to a new format may well be to do
it without people knowing which underlying format is really being
used by their applications. As we move forward with discussions
on data formats and look again at hierarchical approaches (e.g.
Greenfield et al., 2015; Jenness, 2015; Price et al., 2015),
these may adjust the way that people view data models. A
hierarchical view is very different to a flat view and data
modelers should not be constrained by how their models are
represented on disk.
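
As a minimal sketch of the point that one data model can be
serialized into several on-disk formats, the following Python
fragment round-trips a toy spectrum through both a JSON text form
and a packed binary form. The Spectrum class, its field names and
both layouts are hypothetical illustrations chosen for this
example; they are not taken from GSDD, GSD, SDD or FITS.

```python
import json
import struct
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Spectrum:
    """Toy data model; the field names are illustrative only."""
    telescope: str
    rest_freq_ghz: float
    data: List[float]


def to_json(spec: Spectrum) -> bytes:
    """Serialize the model as UTF-8 encoded JSON text."""
    return json.dumps(asdict(spec)).encode("utf-8")


def from_json(raw: bytes) -> Spectrum:
    """Rebuild the model from its JSON form."""
    return Spectrum(**json.loads(raw.decode("utf-8")))


def to_binary(spec: Spectrum) -> bytes:
    """Serialize the model in a made-up little-endian binary layout:
    uint16 name length, ASCII name, float64 frequency,
    uint32 channel count, then float64 channel values."""
    name = spec.telescope.encode("ascii")
    header = struct.pack("<H", len(name)) + name
    body = struct.pack(f"<dI{len(spec.data)}d",
                       spec.rest_freq_ghz, len(spec.data), *spec.data)
    return header + body


def from_binary(raw: bytes) -> Spectrum:
    """Rebuild the model from the binary layout written by to_binary."""
    (name_len,) = struct.unpack_from("<H", raw, 0)
    offset = 2
    name = raw[offset:offset + name_len].decode("ascii")
    offset += name_len
    freq, nchan = struct.unpack_from("<dI", raw, offset)
    offset += struct.calcsize("<dI")
    data = list(struct.unpack_from(f"<{nchan}d", raw, offset))
    return Spectrum(name, freq, data)
```

Code that works with Spectrum objects never needs to know which of
the two byte layouts was used, which is the sense in which the
file format becomes a secondary concern.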

GSDD failed to unify the single-dish radio telescope community
around a single file format. Focusing on the data model as a first
step was the correct decision at the time but it was poorly
implemented with little buy-in from the people writing the
software. Failing to agree on units, coordinate codes and the
approach to adding additional keywords removed any chance of GSDD
being a generically useful data model for the community. Ideally
a GSDD data model library should have been written to abstract the
file format completely from the user, but this was all occurring
before object-oriented programming was a common paradigm. If GSDD
were being implemented now it would be obvious how to wrap data
representing millimeter observations within object-oriented
classes involving differing receiver types and observing modes.
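
One way such a wrapping might look is sketched below in Python. It
is only an illustration under stated assumptions: the class names,
the choice of items and the split into spectral-line and continuum
modes are hypothetical and do not reproduce the GSDD class
structure.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation(ABC):
    """Items common to every observation (names are illustrative)."""
    telescope: str
    source: str

    @abstractmethod
    def summary(self) -> str:
        """One-line description supplied by each observing mode."""


@dataclass
class SpectralLineObservation(Observation):
    """Items that only make sense for a heterodyne receiver."""
    rest_freq_ghz: float = 0.0
    channels: List[float] = field(default_factory=list)

    def summary(self) -> str:
        return (f"{self.source} ({self.telescope}): spectral line at "
                f"{self.rest_freq_ghz} GHz, {len(self.channels)} channels")


@dataclass
class ContinuumObservation(Observation):
    """Items that only make sense for a continuum receiver."""
    wavelength_um: float = 0.0
    samples: List[float] = field(default_factory=list)

    def summary(self) -> str:
        return (f"{self.source} ({self.telescope}): continuum at "
                f"{self.wavelength_um} um, {len(self.samples)} samples")
```

Items that are irrelevant to a given observing mode are then simply
absent from that mode's class, rather than being written out as
unused values.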

Abstracting the data model from the underlying file format is an
idea whose time has come. The Large Synoptic Survey Telescope data
management system (Ivezic et al., 2008; Kantor and Axelrod, 2010)
uses a butler to mediate file access. The user requests data from
the system and the butler then pulls all the relevant data items
together (from a database or from files or from a combination of
the two) and instantiates an object representing that data. For
LSST this object is an instance of an Exposure class, which
represents something relevant to an optical imager, but the butler
could just as easily return an object that is relevant to
millimeter observing.
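
The mediation pattern described above can be sketched in a few
lines of Python. This is a toy illustration and not the LSST
butler API: the ToyButler class, its register and get methods, the
"spectrum" dataset name and the two in-memory dictionaries standing
in for a database and a file store are all assumptions made for
this example.

```python
from typing import Any, Callable, Dict


class ToyButler:
    """Hypothetical mediator: callers ask for data by name, never by path."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[..., Any]] = {}

    def register(self, dataset_type: str, loader: Callable[..., Any]) -> None:
        """Associate a dataset name with the function that assembles it."""
        self._loaders[dataset_type] = loader

    def get(self, dataset_type: str, **data_id: Any) -> Any:
        """Return the object for a dataset name and identifying keys."""
        try:
            loader = self._loaders[dataset_type]
        except KeyError:
            raise KeyError(f"no loader registered for {dataset_type!r}") from None
        return loader(**data_id)


# In-memory stand-ins for a header database and a file store.
_HEADER_DB = {17: {"telescope": "JCMT", "source": "Orion"}}
_FILE_STORE = {17: [0.1, 0.4, 0.2]}


def load_spectrum(scan: int) -> dict:
    """Combine header items and channel data into a single object."""
    return {**_HEADER_DB[scan], "channels": _FILE_STORE[scan]}


butler = ToyButler()
butler.register("spectrum", load_spectrum)
spectrum = butler.get("spectrum", scan=17)
```

The caller sees only the dataset name and the scan number; where
the header items and channel data are stored can change without
any change to the calling code.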

7. Conclusions 

The GSDD data model was used at NRAO and JCMT for many years but
failed in its original goal of unifying single-dish millimeter
astronomy and simplifying data reduction software reuse. As data
reduction packages have evolved it has become clear that the most
important aspect of such packages is format conversion such that
the software can map the external data model to an internal data
model. It is very hard to motivate individual observatories to
target a global standard for raw data without significant
commitment and obvious return on investment. NRAO and JCMT made a
solid attempt but could not maintain the momentum as other
priorities intervened and staff involved in the effort moved to
other projects. Recent examples where observatories have
collaborated on a shared raw data format (e.g. MBFITS; Muders et
al., 2006) have shown that this is possible but depends critically
on the motivation of individuals and on available funding.
Interoperability of reduced data products has significantly
improved since the mid-1980s such that there is a general
expectation that reduced data cubes will be viewable in general
tools. By contrast, interoperability of raw data has remained a
much more elusive goal, at least amongst the sub-mm radio
telescope community.